Finding Your Way through Blogspace: Using Semantics for Cross-Domain Blog Analysis
Authors
Abstract
Blogspace is one of the most dynamic areas of today’s Internet, and it is increasingly recognised that blogs are much more than “meaningless chatter”. Many syntax-based approaches exist to analyse the text of blogs and the network structure between them. While this is very helpful for purposes such as detecting discussion bursts concerning uniquely named topics (e.g., a book, product, or person), it is insufficient for understanding blogs that discuss new phenomena in different wordings, or for finding and explaining relationships between new discourse topics or the context of a new topic in a larger domain of discourse. In this paper, we propose two methods for semantics-enhanced blog analysis that allow the analyst to integrate domain-specific as well as general background knowledge. The methods rely on the Term Extractor for identifying keyphrases (Navigli & Velardi, 2004), SSI (Structural Semantic Interconnections) for disambiguating terms (Navigli & Velardi, 2005), and the taxonomy of domain labels by Magnini & Cavaglià (2000). Applications include topic detection and grouping, the proposal of blog tags and the forming of blog directories, and blog recommender systems. To illustrate the usefulness of our approach, we present a detailed experimental analysis of a sample of four sets of blogs with different thematic foci (food, health, law, and weblogs about blogging).
Similar resources
Bloggers during the London attacks: Top information sources and topics
Blogs are probably most associated with the high profile postings of a few highly popular bloggers who debate or comment on major news stories, but for each ‘A-lister’ there are numerous faceless bloggers who write about their own daily lives and/or interests. Hence it is interesting to investigate the extent to which an event with extensive media coverage, such as the London attacks, is reflec...
The Domain of the semantics of ‘promise’ in the Holy Quran
Semantics is a branch of linguistics through which the meaning of the words and sentences of a text can be analyzed and the parts of speech identified with regard to semantics. This is a descriptive-analytic study of the meaning of ‘promise’ in the Holy Quran, based on principles of semantics with a collocation approach and a library methodology. Also, by virtue of ...
Traffic Characteristics and Communication Patterns in Blogosphere
We present a thorough characterization of the access patterns in blogspace – a fast-growing constituent of the content available through the Internet – which comprises a rich interconnected web of blog postings and comments by an increasingly prominent user community that collectively define what has become known as the blogosphere. Our characterization of over 35 million read, write, and admin...
BlogGrid: Towards an Efficient Information Pushing Service on Blogspace
With increasing concerns about the personalized information space, users have been posting various types of information on their own blogs. Due to the domain-specific properties of blogging systems, however, searching relevant information is too difficult. In this paper, we focus on analyzing the user behaviors on blogspace, so that the channel between two similar users can be virtually generat...
MoodViews: Tools for Blog Mood Analysis
We demonstrate a system for tracking and analyzing moods of bloggers worldwide, as reflected in the largest blogging community, LiveJournal. Our system collects thousands of blog posts every hour, performs various analyses on the posts and presents the results graphically. Exploring the Blogspace From the point of view of information access, the blogspace offers many natural opportunities beyon...